ABSTRACT

The advancement in the field of medical imaging systems has led industries to conceptualize a
complete automated system for medical procedures, diagnosis, treatment and prediction. The
success of such a system largely depends upon the robustness, accuracy and speed of the retrieval
systems. A content-based image retrieval (CBIR) system is valuable in medical systems as it provides
retrieval of images from a large dataset based on similarities. There is continuous research
going on in the area of CBIR systems, typically for medical images, which provides successive
algorithm development toward generalized methodologies that could be widely used. The aim
of this paper is to discuss the various techniques, their assumptions and their scope as suggested by various
researchers, and further to set up a roadmap for research in the field of CBIR systems for medical image
databases.
I. INTRODUCTION

Content-based image retrieval (CBIR) is the application of computer vision techniques to the problem 
of digital image search in large databases. CBIR enables the retrieval of images from databases [1, 2].
Medical images are usually fused, subject to high inconsistency and composed of different minor structures, so
there is a necessity for feature extraction and classification of images for easy and efficient retrieval [3]. CBIR is
the automatic retrieval of images, generally based on particular properties such as color composition,
shape and texture [4, 5]. Every day large volumes of different types of medical images such as dental,
endoscopy, skull, MRI, ultrasound and radiology images are produced in various hospitals and medical
centres [6]. Medical image retrieval has many significant applications, especially in medical diagnosis, education
and research. Medical image retrieval for diagnostic purposes is important because the historical images
of different patients in medical centres hold valuable information for upcoming diagnoses: with a system
that retrieves similar cases, clinicians can make a more accurate diagnosis and decide on appropriate
treatment. The term content-based image
retrieval was introduced by Kato [7] while describing his experiment of image retrieval from a database on the
basis of color and shape features. Image databases in the medical field are growing significantly.
It has been proposed that, for supporting clinical decision making, the integration of content-based access
methods into Picture Archiving and Communication Systems (PACS) will be a mandatory need [8]. In most
biomedical disciplines, digital image data is rapidly expanding in quantity and heterogeneity, and there is an 
increasing trend towards the formation of archives adequate to support diagnostics and preventive medicine. 
Exploration, exploitation, and consolidation of the immense image collections require tools to access 
structurally different data for research, diagnostics and teaching. Currently, image data is linked to textual 
descriptions, and data access is provided only via these textual additives. There are virtually no tools available to 
access medical images directly by their content or to cope with their structural differences. Therefore, visual 
based (i.e. content-based) indexing and retrieval based on information contained in the pixel data of biomedical 
images is expected to have a great impact on biomedical image databases. However, existing systems for 
content-based image retrieval (CBIR) are not applicable to the special needs of biomedical imagery, and novel
methodologies are urgently needed. 

Content-based image retrieval (CBIR) has received significant attention in the literature as a promising
technique to facilitate improved image management in PACS [9, 10]. The Image Retrieval for Medical
Applications (IRMA) project [10, 11] aims to provide visually rich image management through CBIR techniques
applied to medical images using intensity distribution and texture measures taken globally over the entire image.
This approach permits queries on a heterogeneous image collection and helps in identifying images that
are similar with respect to global features. Section 2 highlights the significance of CBIR in medical
imaging, followed by the methods used for implementing CBIR in Section 3. Recent work on CBIR is
presented in Section 4. The issues and research gaps from prior work are illustrated in Section 5, followed by the
conclusion in Section 6.

II. SIGNIFICANCE OF CBIR IN MEDICAL IMAGING 

There are several reasons why there is a need for additional, alternative image retrieval methods apart 
from the steadily growing rate of images acquired every day. It is important to explain these needs and to discuss
possible technical and methodological improvements and the resulting clinical benefits. The goal of medical
information systems has often been defined as delivering the needed information at the right time and the right place
to the right persons in order to improve the quality and efficiency of care processes [12]. Such a goal will most
likely need more than a query by patient name, series ID or study ID for images. For the clinical decision
making process it can be beneficial or even important to find other images of the same modality, the same
anatomic region or the same disease. Although part of this information is normally contained in the Digital
Imaging and Communications in Medicine (DICOM) headers and many imaging devices are DICOM compliant
at this time, there are still some problems. DICOM headers have proven to contain a fairly high rate of errors;
for example, for the anatomical region field, error rates of 16% have been reported [13]. This can hinder the
correct retrieval of all wanted images. Clinical decision support techniques such as case based reasoning [14] or 
evidence based medicine [15,16] can even produce a stronger need to retrieve images that can be valuable for 
supporting certain diagnoses. It could even be imagined to have Image Based Reasoning (IBR) as a new 
discipline for diagnostic aid. Decision support systems in radiology [17] and computer aided diagnostics for 
radiological practice as demonstrated at the RSNA (Radiological Society of North America) [18] are on the rise 
and create a need for powerful data and metadata management and retrieval. 

The general clinical benefit of imaging systems has already been demonstrated by B. Kaplan et al.
[19]. An initiative is described by A. Horsch et al. to identify important tasks for medical imaging based on their
possible clinical benefits. It needs to be stated that purely visual image queries, as they are executed in the
computer vision domain, will most likely never be able to replace text-based methods, as there will always be
queries for all images of a certain patient, but they have the potential to be a very good complement to text-based
search based on their characteristics. Still, the problems and advantages of the technology have to be
stressed to obtain acceptance and use of visual and text-based access methods up to their full potential. A
scenario for hybrid textual and visual queries is put forward for the CBIR system by S. Antani et al.
[21]. Besides diagnostics, teaching and research especially are expected to improve through the use of visual
access methods as visually interesting images can be chosen and can actually be found in the existing large 
repositories. The inclusion of visual features into medical studies is another interesting point for several medical 
research domains. Visual features do not only allow the retrieval of cases with patients having similar diagnoses 
but also cases with visual similarity but different diagnoses. In teaching it can help lecturers as well as students 
to browse educational image repositories and visually inspect the results found. This can be the case for 
navigating in image atlases. It can also be used to cross correlate visual and textual features of the images. 

III. METHODS USED FOR IMPLEMENTING CBIR 

Content-based image retrieval hinges on the ability of the algorithms to extract pertinent image features 
and organize them in a way that represents the image content. Additionally, the algorithms should be able to 
quantify the similarity between the query visual and the database candidate for the image content as perceived 
by the viewer. Thus, there is a systemic component to CBIR and a more challenging semantic
component.

Shape Based Method: For shape based image retrieval, the image feature extracted is usually an
N-dimensional feature vector, which can be regarded as a point in an N-dimensional space. Once images are
indexed into the database using the extracted feature vectors, the retrieval of images is essentially the
determination of similarity between the query image and the target images in the database, which in turn is the
determination of distance between the feature vectors representing the images. The desirable distance measure
should reflect human perception. Various similarity measures have been exploited in image retrieval. In one
implementation by H. Muller, Euclidean distance was used for similarity measurement [22].

Texture Based Method: Texture measures have an even larger variety than color measures. Some common measures for
capturing the texture of images are wavelets and Gabor filters. The texture measures try to capture the
characteristics of images or image parts with respect to changes in certain directions and the scale of changes. This
is most useful for regions or images with homogeneous texture.
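
The distance-based retrieval described for the shape based method can be illustrated with a minimal sketch. It assumes that every image has already been reduced to a fixed-length feature vector; the image identifiers, the three-dimensional vectors and the function names below are hypothetical and are not taken from any of the cited systems.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two feature vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query, database, top_k=3):
    """Return the top_k (image_id, distance) pairs, smallest distance first."""
    scored = [(image_id, euclidean_distance(query, vector))
              for image_id, vector in database.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


# Hypothetical index: image id -> 3-dimensional feature vector.
database = {
    "img_a": [0.0, 0.0, 0.0],
    "img_b": [1.0, 1.0, 1.0],
    "img_c": [3.0, 4.0, 0.0],
}
query = [0.0, 0.0, 0.0]
results = rank_by_similarity(query, database, top_k=2)
print(results)
```

A linear scan like this is only practical for small collections; large databases would normally add an index structure over the feature vectors, which is outside the scope of this sketch.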

Continuous Feature Selection Method: This method deals with the "dimensionality curse" and the
semantic gap problem. The method applies statistical association rule mining to relate low-level features to
the specialist's high-level knowledge about the image, in order to reduce the semantic gap between
image representation and interpretation. These rules are employed to weigh the features according to their
relevance. Dimensionality reduction is performed by discarding the irrelevant features (the ones whose
weight is null). The obtained weights are used to calculate the similarity between images during
content-based searching. Experiments show that the proposed method improves query precision
by up to 38%. Moreover, the method is efficient, since the complexity of query processing decreases along with the
dimensionality of the feature vector.

With Automatically Extracted MeSH Terms: There is still a
semantic gap between the automatically extracted low-level visual features (textures, colors) and the high-level
concepts that users normally search for (tumors, abnormal tissue) [22].
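
As an aside on the continuous feature selection method described above, the following sketch shows how per-feature weights could be used both to discard features and to compute similarity. It assumes the weights are already available; the association rule mining that produces them is not reproduced here, and the weighted Euclidean form, the weight values and the function names are assumptions made for illustration only.

```python
import math


def prune_features(weights):
    """Indices of the features that are kept, i.e. whose weight is not zero."""
    return [i for i, w in enumerate(weights) if w != 0]


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance computed over the kept features only."""
    kept = prune_features(weights)
    return math.sqrt(sum(weights[i] * (a[i] - b[i]) ** 2 for i in kept))


# Hypothetical relevance weights for a 4-dimensional feature vector.
weights = [2.0, 0.0, 1.0, 0.0]
query = [1.0, 9.0, 2.0, 7.0]
candidate = [2.0, 0.0, 4.0, 0.0]

kept = prune_features(weights)
d = weighted_distance(query, candidate, weights)
print(kept, d)
```

Because zero-weight features never enter the sum, the cost of each comparison depends on the number of kept features rather than on the original dimensionality, which is the efficiency argument made in the text.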

Solutions proposed by Juan C. Caicedo et al. to bridge the semantic gap include the connection of visual
features to known textual labels of the images [23]; the training of a classifier on known class labels and
its use on unknown cases is also discussed by Jia Li et al. [24]. Combinations of textual and
visual features for medical image retrieval have as yet rarely been applied, although medical images in the
electronic patient record or case databases basically always have text attached to them. The complementary
nature of text and visual image features for retrieval promises to lead to good retrieval results.

Using Low Level Visual Features: The image retrieval process consists of two main phases: a pre-processing phase and
a retrieval phase. The pre-processing phase is composed of two main
components: a feature extraction model and a classification model. The input of the pre-processing phase is the
original image database, i.e. images from the ImageCLEFmed collection, with more than 66,000 medical
images. The output of the pre-processing phase is an index relating each image to its modality and a feature
database.

The Feature Extraction Model: The feature extraction model operates on the image database to
produce two kinds of features: histogram features and meta features. Histogram features are used to build the
feature database, which is used in the retrieval phase to rank similar images. Meta features are a set of histogram
descriptors, which are used as the input to the classification model. Histogram features used
in this system are:

o Gray scale and color histogram (Gray and RGB) 
o Local Binary Partition histogram (LBP) 
o Tamura texture histogram (Tamura) 
o Sobel histogram (Sobel) 

o Invariant feature histogram (Invariant)

Meta features are calculated from histogram features in order to reduce
the dimensionality. These meta features are the four moments of the moment generating function (mean,
deviation, skewness and kurtosis) and the entropy of the histogram. Each histogram has five associated meta
features, meaning a total of 30 meta features with information of color, texture, edges and invariants.
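
The meta features above can be sketched as follows. The sketch assumes that each histogram is normalised into a probability distribution over its bin indices and that the moments are taken over those indices; the source does not state the exact convention, so this is one plausible reading. The histogram values are invented, and the six names follow the list above with the gray scale and color histograms counted separately.

```python
import math


def meta_features(histogram):
    """Return [mean, deviation, skewness, kurtosis, entropy] of a histogram."""
    total = sum(histogram)
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    p = [count / total for count in histogram]
    mean = sum(i * pi for i, pi in enumerate(p))
    variance = sum((i - mean) ** 2 * pi for i, pi in enumerate(p))
    std = math.sqrt(variance)
    if std == 0:
        # All mass in a single bin: higher standardised moments are undefined.
        skewness = kurtosis = 0.0
    else:
        skewness = sum(((i - mean) / std) ** 3 * pi for i, pi in enumerate(p))
        kurtosis = sum(((i - mean) / std) ** 4 * pi for i, pi in enumerate(p))
    entropy = -sum(pi * math.log2(pi) for pi in p if pi > 0)
    return [mean, std, skewness, kurtosis, entropy]


# Hypothetical 4-bin histograms for the six histogram types.
histograms = {
    "Gray": [1, 1, 1, 1],
    "RGB": [4, 2, 1, 1],
    "LBP": [0, 3, 3, 0],
    "Tamura": [5, 0, 0, 5],
    "Sobel": [1, 2, 3, 4],
    "Invariant": [2, 2, 2, 2],
}

# Six histograms with five meta features each give a 30-dimensional vector.
feature_vector = [value
                  for name in histograms
                  for value in meta_features(histograms[name])]
print(len(feature_vector))
```

In the described system such a vector would be passed to the classification model, while the full histograms remain in the feature database for ranking.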

IV. RECENT WORK IN CBIR 

Support vector machines (SVM) are extensively used to learn from relevance feedback due to their
capability of effectively tackling the above difficulties. However, the performance of SVM depends on the
tuning of a number of parameters. A different approach is based on the nearest neighbour paradigm. Each
image is ranked according to a relevance score depending on nearest neighbour distances. This approach allows
recalling a higher percentage of images with respect to SVM-based techniques [25]. Thereafter, quotient space
granularity computing theory was introduced into the image retrieval field to clarify granularity thinking in image
retrieval, and a novel image retrieval method was proposed. First, in view of the different behaviours under different
granularities, color features are obtained under different granularities, yielding different quotient spaces; second,
the obtained quotient spaces are combined by attribute according to the quotient space granularity
combination principle; image retrieval is then realized using the combined attribute function [26]. Then a
combination of three feature extraction methods, namely color, texture, and edge histogram descriptor, is
reviewed. There is a provision to add new features in future for better retrieval efficiency. Any combination of
these methods that is more appropriate for the application can be used for retrieval. This is provided through the
User Interface (UI) in the form of relevance feedback. The image properties in this work are analyzed using
computer vision and image processing algorithms.
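
The nearest neighbour relevance score mentioned above can be illustrated with a small sketch. The text does not give the exact formula, so the ratio used here (distance to the nearest non-relevant image divided by the sum of the distances to the nearest relevant and nearest non-relevant images) is an assumption chosen for illustration, as are the two-dimensional feature vectors and the function names.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nn_relevance(image, relevant, non_relevant):
    """Score in [0, 1]; higher means closer to relevant feedback examples."""
    d_rel = min(distance(image, r) for r in relevant)
    d_non = min(distance(image, n) for n in non_relevant)
    if d_rel + d_non == 0:
        return 0.5  # identical to both a relevant and a non-relevant example
    return d_non / (d_rel + d_non)


# Hypothetical user feedback: images marked relevant and non-relevant.
relevant = [[0.0, 0.0]]
non_relevant = [[10.0, 0.0]]

candidates = {
    "near_relevant": [1.0, 0.0],
    "in_between": [5.0, 0.0],
    "near_non_relevant": [9.0, 0.0],
}
scores = {name: nn_relevance(vec, relevant, non_relevant)
          for name, vec in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Unlike an SVM, this scoring has no kernel or regularisation parameters to tune, which matches the motivation given in the text; its cost grows with the number of feedback examples.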

Evaluating an emotional response to color images: This work mainly applies a case-based reasoning
methodology to the emotional evaluation of color images, determines fuzzy similarity relations as well as
inter- and intra-similarities, and makes use of MPEG-7 visual descriptors [27].

3D Object: Efficient retrieval technology for 3D objects is highly desired. The approach reviewed uses an
intelligent query methodology with multiple views and a representative query view [28].

Relevance Feedback: Another methodology classifies the images returned for a text or image query into relevant and
irrelevant sets in order to select the positive images, which then serve as a reference for retrieving the
relevant images from the database [29].
Figure: Block diagram of CBIR system



V. RESEARCH GAP 

There are various areas to work with for the improvement of the content based image retrieval system. 
It has already been discussed that the existing techniques may be used to improve the quality of image retrieval
and the understanding of user intentions. An approach that combines two different approaches to image
retrieval, together with active use of context information and interaction, has been proposed. The problem of
bridging the semantic gap between a high-level query, which is normally posed in terms of an example image, and the
low-level features of an image such as color, texture, shape and object has made it necessary to apply techniques that
reduce this gap. One approach to making a fair assessment of the state of the field is by comparing CBIR
applications presented in the literature. 

A survey of recent research on the implementation of CBIR in medical imaging, with regard to the nature and content
of the data, highlights that it is necessary to develop comparison methods that analyze more than the selection of
particular techniques and the experimental results presented in the literature. Rather, it may be better to formally
describe an idealized CBIR system and identify the shortcomings in the candidate system. These shortcomings
have been labelled as "gaps" and extensively discussed by L. R. Long et al. [30]. The concept of the gap is a
generalization of the well-known "semantic gap" that refers to the difficulty of capturing high-level imaged 
content semantics from extracted low-level image features. These gaps have been broadly categorized into four 
types and defined below: 

1. The Content Gap addresses a system's ability to foster human understanding of concepts from 
extracted features. In medical applications, it also refers to the extent to which the system adapts to 
varying modalities, context, and diagnostic protocols. 

2. The Feature Gap addresses the extent to which the image features are extracted. This is measured along 
several dimensions: degree of automation, degree of detail captured along the content axis (object 
structure), use of multi-scalar techniques, the use of space and (if available) time dimension in image 
data, and use of all channels on each dimension. 
3. The Performance Gap addresses practicalities of system implementation and acceptance. It
evaluates system availability, extent of integration into the medical infrastructure, use of feature 
indexing techniques, and the extent to which the system was evaluated. 

4. The Usability Gap measures the richness of available query features and the extent to which they
can be combined, available support for comprehending the results returned by the system, and available 
support for query refinement. 

Addressing these aspects makes a CBIR system more usable, and may increase its acceptance into the
medical (clinical, research, or education) workflow.

VI. CONCLUSION 

This survey highlights the significant contributions in the field of content-based image and information
retrieval. The difficulty faced by CBIR methods in making inroads into medical applications can be
attributed to a combination of several factors. Some of the leading causes can be categorized according to the
"gaps" model presented above. An idealized system can be designed to overcome all the above gaps, but still fall
short of being accepted into the medical community for lack of (i) useful and clear querying capability; (ii)
meaningful and easily understandable responses; and (iii) provision to adapt to user feedback. The opposite is
also true to some extent. A technically mediocre, but promising, system may obtain valuable end user feedback, 
and by technical improvement may increase user acceptance with the application of usability design principles. 
Other than item (iii), which still needs significant research effort, the usability gap can only be bridged by 
keeping the end user in mind from early system development, as well as by conducting well designed usability 
studies with targeted users. In general, a high involvement of the user community in system design and 
development can significantly improve adoption and acceptance. The preceding sections already showed the
large variability in techniques that are used for the retrieval of images. Still, several very successful techniques
from the image retrieval domain have not yet been used for medical images. The entire discussion on
relevance feedback, which first improved the performance of text retrieval systems and then, 30 years later, that of
image retrieval systems, has not been taken up at all in the medical domain.